Task 1

Get WD

getwd()
## [1] "/Users/tristanlarose/Desktop/R-Stuff/MATH4753TRISTAN2023/R/Lab 8"

Task 2

runif

a=0;
b=5;


sample<-runif(10,a,b)
sample
##  [1] 2.81486637 1.16004337 2.42515295 4.63927998 3.09956834 1.03697695
##  [7] 1.53040686 0.81914339 0.09322002 2.48561601

mean and vairance

mu = (a+b)/2
var = (b-a)^2/12

c(mu,var)
## [1] 2.500000 2.083333

x-bar and \(x^2\)

xbar<-mean(sample)
svar<-var(sample)

c(xbar,svar)
## [1] 2.010427 1.793115

The sample mean is near to \(\mu\) the sample variance usually is close to \(\sigma^2\), but smaller.

From what we know, we get \(E(\bar Y) = E(Y_i) = \mu = (a+b)/2, E(T)=nE(Y_i)=n\mu=n(a+b)/2,V(Y_i)=(1/n)V(Y_i)=(1/n)\sigma^2=(b-a)^2/(12n), and V(T)=nV(Y_i)=n\sigma^2=n(b-a)^2/12\)

We have myclt, which I have done as clt:

clt=function(n,iter){
y=runif(n*iter,a,b)
data=matrix(y,nr=n,nc=iter,byrow=TRUE)
sm=apply(data,2,sum)
h=hist(sm,plot=FALSE)
hist(sm,col=rainbow(length(h$mids)),freq=FALSE,main="Distribution of the sum of uniforms")
curve(dnorm(x,mean=n*(a+b)/2,sd=sqrt(n*(b-a)^2/12)),add=TRUE,lwd=2,col="Blue")
sm
}
w=clt(n=10,iter=10000)

Line A: we made a random sample of the size n * iter from the uniform distribution that has parameters a and b. Line B: the sample is put into a matrix with the dimensions of n x iter, splitting the large sample into iter samples of n.  Line C: this sums the columns of the aforementioned matrix. Line D: we invoke the function with n=10 and iter=10000 and put the output in w.

Sample estimates

wbar = mean(w)
wvar = var(w)

c(wbar,wvar)
## [1] 24.91263 21.08435

Modifications

clt.mean=function(n,iter){
y=runif(n*iter,a,b)
data=matrix(y,nr=n,nc=iter,byrow=TRUE)
m=apply(data,2,mean)
h=hist(m,plot=FALSE)
hist(m,col=rainbow(length(h$mids)),freq=FALSE,main="Distribution of the mean of uniforms")
curve(dnorm(x,mean=n*(a+b)/2,sd=sqrt(n*(b-a)^2/12)),add=TRUE,lwd=2,col="Blue")
m
}
w=clt.mean(n=10,iter=10000)

Task 3

mycltu

### CLT uniform 
## my Central Limit Function
## Notice that I have assigned default values which can be changed when the function is called
mycltu=function(n,iter,a=0,b=10){
## r-random sample from the uniform
y=runif(n*iter,a,b)
## Place these numbers into a matrix
## The columns will correspond to the iteration and the rows will equal the sample size n
data=matrix(y,nr=n,nc=iter,byrow=TRUE)
## apply the function mean to the columns (2) of the matrix
## these are placed in a vector w
w=apply(data,2,mean)
## We will make a histogram of the values in w
## How high should we make y axis?
## All the values used to make a histogram are placed in param (nothing is plotted yet)
param=hist(w,plot=FALSE)
## Since the histogram will be a density plot we will find the max density

ymax=max(param$density)
## To be on the safe side we will add 10% more to this
ymax=1.1*ymax
## Now we can make the histogram
hist(w,freq=FALSE,  ylim=c(0,ymax), main=paste("Histogram of sample mean",
"\n", "sample size= ",n,sep=""),xlab="Sample mean")
## add a density curve made from the sample distribution
lines(density(w),col="Blue",lwd=3) # add a density plot
## Add a theoretical normal curve 
curve(dnorm(x,mean=(a+b)/2,sd=(b-a)/(sqrt(12*n))),add=TRUE,col="Red",lty=2,lwd=3) # add a theoretical curve
## Add the density from which the samples were taken
curve(dunif(x,a,b),add=TRUE,lwd=4)

}

The \(apply\) funtion’s second argument tells the function to operate on the rows or collumns, so being set to 2 means it will operate on the collumns.

With n=20 and iter=100000, y is a sample of size n * iter, so the matrix contains this many elements (2000000).

We know that a uniform distribution’s variance is \(\sigma^2_Y=(b-a)^2/12\), which tells us that \(\sigma_Y=(b-a)/\sqrt{12}\). Since \(V(\bar Y) =(1/n)V(Y)\), we can also say that \(\sigma_{\bar Y}=(b-a)/(\sqrt{12n})\).

for (n in c(1,2,3,5,10,30)) {
  mycltu(n = n, iter = 10000, a = 0, b = 10)
}

The uniform distribution starts to look a lot like the uniform distribution, even as low as n = 3 or n = 5. The mean of the sampling mean appears to hold constant. Therefore, the Central Limit Theorem is supported.

Task 4

mycltb

## CLT Binomial
## CLT will work with discrete or continuous distributions 
## my Central Limit Function
## Notice that I have assigned default values which can be changed when the function is called

mycltb=function(n,iter,p=0.5,...){

## r-random sample from the Binomial
y=rbinom(n*iter,size=n,prob=p)
## Place these numbers into a matrix
## The columns will correspond to the iteration and the rows will equal the sample size n
data=matrix(y,nr=n,nc=iter,byrow=TRUE)
## apply the function mean to the columns (2) of the matrix
## these are placed in a vector w
w=apply(data,2,mean)
## We will make a histogram of the values in w
## How high should we make y axis?
## All the values used to make a histogram are placed in param (nothing is plotted yet)
param=hist(w,plot=FALSE)
## Since the histogram will be a density plot we will find the max density

ymax=max(param$density)
## To be on the safe side we will add 10% more to this
ymax=1.1*ymax

## Now we can make the histogram
## freq=FALSE means take a density
hist(w,freq=FALSE,  ylim=c(0,ymax),
main=paste("Histogram of sample mean","\n", "sample size= ",n,sep=""),
xlab="Sample mean",...)
## add a density curve made from the sample distribution
#lines(density(w),col="Blue",lwd=3) # add a density plot
## Add a theoretical normal curve 
curve(dnorm(x,mean=n*p,sd=sqrt(p*(1-p))),add=TRUE,col="Red",lty=2,lwd=3) 

}

cltb.p <- function(p=0.4,...) {
  for (n in c(4,5,10,20)) {
    mycltb(n = n, iter = 10000, p = p, ...)
  }
}
cltb.p(0.3);

cltb.p(0.7);

cltb.p(0.5);

Once we get up to about n=5, then the sampling distributions from the binomial distribution start to look Normal. The mean varies with both p and n because it is dependent on both as a parameter. The CLT holds from these observations.

Task 5

mycltp

####### Poisson ######################

## CLT Poisson
## CLT will work with discrete or continuous distributions 
## my Central Limit Function
## Notice that I have assigned default values which can be changed when the function is called

mycltp=function(n,iter,lambda=10,...){

## r-random sample from the Poisson
y=rpois(n*iter,lambda=lambda)
## Place these numbers into a matrix
## The columns will correspond to the iteration and the rows will equal the sample size n
data=matrix(y,nr=n,nc=iter,byrow=TRUE)
## apply the function mean to the columns (2) of the matrix
## these are placed in a vector w
w=apply(data,2,mean)
## We will make a histogram of the values in w
## How high should we make y axis?
## All the values used to make a histogram are placed in param (nothing is plotted yet)
param=hist(w,plot=FALSE)
## Since the histogram will be a density plot we will find the max density

ymax=max(param$density)
## To be on the safe side we will add 10% more to this
ymax=1.1*ymax

## Make a suitable layout for graphing
layout(matrix(c(1,1,2,3),nr=2,nc=2, byrow=TRUE))

## Now we can make the histogram
hist(w,freq=FALSE,  ylim=c(0,ymax), col=rainbow(max(w)),
main=paste("Histogram of sample mean","\n", "sample size= ",n," iter=",iter," lambda=",lambda,sep=""),
xlab="Sample mean",...)
## add a density curve made from the sample distribution
#lines(density(w),col="Blue",lwd=3) # add a density plot
## Add a theoretical normal curve 
curve(dnorm(x,mean=lambda,sd=sqrt(lambda/n)),add=TRUE,col="Red",lty=2,lwd=3) # add a theoretical curve

# Now make a new plot
# Since y is discrete we should use a barplot
barplot(table(y)/(n*iter),col=rainbow(max(y)), main="Barplot of sampled y", ylab ="Rel. Freq",xlab="y" )
x=0:max(y)
plot(x,dpois(x,lambda=lambda),type="h",lwd=5,col=rainbow(max(y)),
main="Probability function for Poisson", ylab="Probability",xlab="y")
}


  for (n in c(2,3,5,10,20)) {
    mycltp(n = n, iter = 10000, lambda = 4)
  }

for (n in c(2,3,5,10,20)) {
    mycltp(n = n, iter = 10000, lambda = 10)
  }

Task 6

library("MATH4753TRISTAN2023");
mcltt <- myclt(n=5,iter=10000)